21 research outputs found
Comparison of Style Features for the Authorship Verification of Literary Texts
The article compares character-level, word-level, and rhythm features for the authorship verification of literary texts of the 19th-21st centuries. The text corpora contain fragments of novels; each fragment is about 50,000 characters long, and there are 40 fragments for each author. 20 authors who wrote in English, Russian, and French, and 8 Spanish-language authors, are considered. The authors of this paper use existing algorithms to calculate low-level features, popular in computational linguistics, and rhythm features, common in literary texts. Low-level features include n-grams of words, frequencies of letters and punctuation marks, average word and sentence lengths, etc. Rhythm features are based on lexico-grammatical figures: anaphora, epiphora, symploce, aposiopesis, epanalepsis, anadiplosis, diacope, epizeuxis, chiasmus, polysyndeton, and repetitive exclamatory and interrogative sentences. These features include the frequency of occurrence of particular rhythm figures per 100 sentences, the number of unique words in the aspects of rhythm, and the percentage of nouns, adjectives, adverbs, and verbs in the aspects of rhythm. Authorship verification is treated as a binary classification problem: whether the text belongs to a particular author or not. AdaBoost and a neural network with an LSTM layer are used as classification algorithms. The experiments demonstrate the effectiveness of rhythm features in the verification of particular authors, and the superiority, on average, of combinations of feature types over single feature types. The best values of precision, recall, and F-measure for the AdaBoost classifier exceed 90% when all three types of features are combined.
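The "frequency of a rhythm figure per 100 sentences" feature mentioned above can be illustrated with a minimal sketch. The definitions used here are simplifying assumptions and are not taken from the article: anaphora is counted when two adjacent sentences begin with the same word, epiphora when they end with the same word. The names `split_sentences`, `words`, and `figure_rates` are hypothetical.

```python
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def words(sentence):
    """Lower-cased alphabetic tokens of one sentence."""
    return re.findall(r"[^\W\d_]+", sentence.lower())


def figure_rates(text):
    """Count simplified anaphora/epiphora and scale the counts to 100 sentences."""
    sentences = [words(s) for s in split_sentences(text)]
    sentences = [w for w in sentences if w]  # drop sentences without words
    anaphora = epiphora = 0
    for prev, cur in zip(sentences, sentences[1:]):
        if prev[0] == cur[0]:    # same first word in adjacent sentences
            anaphora += 1
        if prev[-1] == cur[-1]:  # same last word in adjacent sentences
            epiphora += 1
    n = len(sentences)
    scale = 100.0 / n if n else 0.0
    return {
        "anaphora_per_100": anaphora * scale,
        "epiphora_per_100": epiphora * scale,
    }


rates = figure_rates("We came. We saw. We won the day. They lost the day.")
```

A vector of such rates, one per figure, is the kind of input a binary classifier could receive; the article's own detection rules are likely richer than this sketch.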
Sentiment classification of long newspaper articles based on automatically generated thesaurus with various semantic relationships
The paper describes a new approach to sentiment classification of long newspaper texts using an automatically generated thesaurus. An important part of the proposed approach is the creation of a specialized thesaurus and the computation of terms' sentiment polarities based on relationships between terms. The efficiency of the approach has been demonstrated on a corpus of articles about American immigrants. The experiments showed that the automatically created thesaurus provides better classification quality than manual ones, and that for this task the approach generally outperforms existing ones.
Text Classification by Genre Based on Rhythm Features
The article analyzes the rhythm of texts of different genres: fiction novels, advertisements, scientific articles, reviews, tweets, and political articles. The authors identified lexico-grammatical figures in the texts (anaphora, epiphora, diacope, aposiopesis, etc.) that are markers of text rhythm. On their basis, statistical features were calculated that describe these rhythm figures quantitatively and structurally. The resulting text model was visualized for statistical analysis using boxplots and heat maps, which showed differences in the rhythm of texts of different genres. The boxplots showed that almost all genres differ from each other in the overall density of rhythm features. The heat maps showed different rhythm patterns across genres. The rhythm features were then successfully used to classify texts into six genres. The classification was carried out in two ways: a binary classification for each genre, separating that genre from the remaining ones, and a multi-class classification of the text corpus into six genres at once. Two text corpora, in English and Russian, were used for the experiments. Each corpus contains 100 fiction novels, scientific articles, advertisements and tweets, 50 reviews and political articles, i.e. a total of 500 texts. The high quality of the classification with neural networks showed that rhythm features are a good marker for most genres, especially fiction. The experiments were carried out using the ProseRhythmDetector software tool for the Russian and English languages. The text corpora contain 300 texts for each language.
A survey on thesauri application in automatic natural language processing
This paper investigates the efficiency of thesaurus use in popular natural language processing (NLP) fields: information retrieval and the analysis of texts and subject areas. A thesaurus is a natural language resource that models a subject area and can reflect a human expert's knowledge in many NLP tasks. The main goal of this survey is to determine how much thesauri affect processing quality and where they can provide better performance. We describe studies that use different types of thesauri, discuss the contribution of the thesaurus to the results achieved, and propose directions for future research in the thesaurus field.
Sentiment Classification of Russian Texts Using Automatically Generated Thesaurus
This paper presents an approach to sentiment classification of Russian texts that applies an automatically generated thesaurus of the subject area. The approach consists of a standard machine learning classifier and a procedure embedded in it that uses thesaurus relationships for better sentiment analysis. The thesaurus is generated fully automatically and does not require an expert's involvement in the classification process. Experiments conducted with the approach on four Russian-language text corpora show the effectiveness of applying a thesaurus to sentiment classification.
Genre Classification of Russian-Language Texts Based on Modern Embeddings and Rhythm
The article investigates modern vector text models for the genre classification of Russian-language texts. The models include ELMo embeddings, a pre-trained BERT language model, and a set of numerical rhythm features based on lexico-grammatical features. The experiments were carried out on a corpus of 10,000 texts in five genres: novels, scientific articles, reviews, posts from the social network VKontakte, and news from OpenCorpora. Visualization and statistical analysis of the rhythm features identified both the genres most diverse in rhythm (novels and reviews) and the least diverse one (scientific articles). These genres were subsequently classified best using rhythm features and an LSTM neural network classifier. Clustering and classifying texts by genre using ELMo and BERT embeddings separated the genres from one another with a small number of errors. The multi-class F-score reached 99%. The study confirms the efficiency of modern embeddings in computational linguistics tasks and highlights the advantages and limitations of the set of rhythm features in genre classification.
Sentiment Classification into Three Classes Applying Multinomial Bayes Algorithm, N-grams, and Thesaurus
The paper develops a method that classifies texts in English and Russian by sentiment into positive, negative, and neutral. The proposed method is based on the Multinomial Naive Bayes classifier with the additional use of n-grams. The classifier is trained either on three classes, or on two contrasting classes with a threshold to separate neutral texts. Experiments with texts on various topics showed a significant improvement in classification quality for reviews from a particular domain. In addition, the application of thesaurus relationships to three-class sentiment classification was analyzed; however, it did not significantly improve the classification results.
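The second training scheme described above (two contrasting classes plus a threshold for neutral texts) can be sketched as follows. This is only an illustration under stated assumptions: the class `TwoClassNB`, the use of unigrams and bigrams, Laplace smoothing, and the threshold applied to the log-odds are all hypothetical choices, not details confirmed by the paper.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Word unigrams and bigrams of a whitespace-tokenized, lower-cased text."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats


class TwoClassNB:
    """Multinomial Naive Bayes trained on 'pos' and 'neg' texts only."""

    def fit(self, texts, labels):
        self.counts = {"pos": Counter(), "neg": Counter()}
        self.docs = Counter(labels)
        for text, label in zip(texts, labels):
            self.counts[label].update(ngrams(text))
        self.vocab = set(self.counts["pos"]) | set(self.counts["neg"])
        return self

    def log_odds(self, text):
        """log P(pos | text) - log P(neg | text), with Laplace smoothing."""
        total_docs = sum(self.docs.values())
        score = {}
        for label in ("pos", "neg"):
            total = sum(self.counts[label].values())
            s = math.log(self.docs[label] / total_docs)
            for f in ngrams(text):
                if f in self.vocab:  # unseen n-grams are ignored
                    s += math.log((self.counts[label][f] + 1) / (total + len(self.vocab)))
            score[label] = s
        return score["pos"] - score["neg"]

    def predict(self, text, threshold=0.5):
        """Texts whose log-odds fall within the threshold band are called neutral."""
        d = self.log_odds(text)
        if abs(d) < threshold:
            return "neutral"
        return "positive" if d > 0 else "negative"


model = TwoClassNB().fit(
    ["good great film", "great acting good plot", "bad boring film", "boring plot bad acting"],
    ["pos", "pos", "neg", "neg"],
)
```

In this sketch a text is labelled neutral when the evidence for the two contrasting classes is nearly balanced; the threshold value would have to be tuned on held-out data.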
Analysis of the Use of Various Types of Relations between Terms of a Thesaurus Generated by Hybrid Methods in Text Classification Tasks
The main purpose of the article is to analyze how effectively different types of thesaurus relations can be used in text classification tasks. The study is based on an automatically generated thesaurus of a subject area that contains three types of relations: synonymous, hierarchical, and associative. To generate the thesaurus, the authors use a hybrid method based on several linguistic and statistical algorithms for the extraction of semantic relations. The method makes it possible to create a thesaurus with a sufficiently large number of terms and relations among them. The authors consider two problems: topical text classification and sentiment classification of large newspaper articles. To solve them, the authors developed two approaches that complement standard algorithms with a procedure that takes thesaurus relations into account to determine the semantic features of texts. The approach to topical classification includes the standard unsupervised BM25 algorithm and a procedure that takes into account the synonymous and hierarchical relations of the subject-area thesaurus. The approach to sentiment classification consists of two steps. In the first step, a thesaurus is created whose terms' weight polarities are calculated from the term occurrences in the training set or from the weights of related thesaurus terms. In the second step, the thesaurus is used to compute the features of words from the texts and to classify the texts with the SVM or Naive Bayes algorithm. In experiments with the text corpora BBCSport, Reuters, PubMed, and the corpus of articles about American immigrants, the authors varied the types of thesaurus relations involved in the classification and the degree of their use. The results of the experiments make it possible to evaluate the efficiency of applying thesaurus relations to the classification of raw texts and to determine under what conditions particular relations have more or less effect. In particular, the most useful thesaurus relations are the synonymous and hierarchical ones, as they provide better classification quality.
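One way to picture how thesaurus relations could feed a BM25-based topical classifier is to expand the topic terms with related thesaurus terms before scoring. The sketch below assumes exactly that; it is not the authors' procedure. The functions `bm25_scores` and `expand`, the toy `thesaurus` mapping, and the IDF variant with `+ 1.0` inside the logarithm (chosen to keep scores non-negative) are illustrative assumptions.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document in a non-empty list."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequencies
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            s += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores


def expand(terms, thesaurus):
    """Add the synonyms/hyponyms listed in a (hypothetical) thesaurus mapping."""
    expanded = set(terms)
    for t in terms:
        expanded.update(thesaurus.get(t, ()))
    return expanded


docs = [
    ["the", "striker", "scored", "a", "goal"],
    ["the", "bank", "raised", "interest", "rates"],
]
thesaurus = {"football": ["striker", "goal"]}  # toy relations, for illustration only

plain = bm25_scores({"football"}, docs)
expanded = bm25_scores(expand({"football"}, thesaurus), docs)
```

Here the plain topic term matches neither document, while the expanded term set gives the first document a positive score, which is the kind of effect synonymous and hierarchical relations could have on topical classification.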
Text Classification by CEFR Levels Using Machine Learning Methods and the BERT Language Model
This paper presents a study of the automatic classification of short coherent texts (essays) in English according to the levels of the international CEFR scale. Determining the level of a natural-language text is an important component of assessing students' knowledge, including the checking of open tasks in e-learning systems. To solve this problem, vector text models based on stylometric numerical features at the character, word, and sentence-structure levels were considered. The resulting vectors were classified by standard machine learning classifiers. The article presents the results of the three most successful ones: Support Vector Classifier, Stochastic Gradient Descent Classifier, and LogisticRegression. Precision, recall, and F-score served as quality measures. Two open text corpora, CEFR Levelled English Texts and BEA-2019, were chosen for the experiments. The best classification results for six CEFR levels and sublevels from A1 to C2 were shown by the Support Vector Classifier, with an F-score of 67% on CEFR Levelled English Texts. This approach was compared with the application of the BERT language model (six different variants). The best model, bert-base-cased, achieved an F-score of 69%. The analysis of classification errors showed that most of them occur between neighboring levels, which is quite understandable from the point of view of the domain. In addition, the quality of classification strongly depended on the text corpus, as shown by a significant difference in F-scores when the same text models were applied to different corpora. In general, the results showed the effectiveness of automatic text level detection and the possibility of its practical application.
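A vector of stylometric features at the character, word, and sentence levels can be sketched as below. The four features and the function name `stylometric_vector` are assumptions chosen for illustration; the paper's actual feature set is not listed in the abstract.

```python
import re
import string


def stylometric_vector(text):
    """Return a few character-, word- and sentence-level features of a text."""
    # Naive sentence split (an assumption): after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    n_chars = len(text)
    punct = sum(ch in string.punctuation for ch in text)
    return {
        # character level: share of punctuation marks among all characters
        "punct_share": punct / n_chars if n_chars else 0.0,
        # word level: average word length and lexical diversity
        "avg_word_len": sum(map(len, tokens)) / len(tokens) if tokens else 0.0,
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
        # sentence level: average sentence length in words
        "avg_sentence_len": len(tokens) / len(sentences) if sentences else 0.0,
    }


features = stylometric_vector("I like cats. I like dogs.")
```

Vectors of this kind, computed for every essay, are what a standard classifier such as a support vector classifier would be trained on.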
Russian-Language Thesauri: Automated Construction and Application in Natural Language Processing Tasks
The paper reviews the existing Russian-language thesauri in digital form and the methods of their automatic construction and application. The authors analyzed the main characteristics of open-access thesauri for scientific research and evaluated trends in their development and their effectiveness in solving natural language processing tasks. The statistical and linguistic methods of thesaurus construction that make it possible to automate development and reduce the labor costs of expert linguists were studied. In particular, the authors considered algorithms for extracting keywords and semantic thesaurus relationships of all types, as well as the quality of the thesauri generated with these tools. To illustrate the features of various methods for constructing thesaurus relationships, the authors developed a combined method that generates a specialized thesaurus fully automatically from a text corpus in a particular domain and several existing linguistic resources. With the proposed method, experiments were conducted on Russian-language text corpora from two subject areas: articles about migrants and tweets. The resulting thesauri were evaluated with an integrated assessment developed in the authors' previous study, which makes it possible to analyze various aspects of a thesaurus and the quality of the methods used to generate it. The analysis revealed the main advantages and disadvantages of various approaches to the construction of thesauri and the extraction of semantic relationships of different types, and made it possible to determine directions for future study.